Abstract. - We investigate the correlation properties of transaction data from the New York 
Stock Exchange. The trading activity $f_i(t)$ of each stock $i$ displays a crossover from weaker to 
stronger correlations at time scales of 60 to 390 minutes. In both regimes, the Hurst exponent $H$ 
depends logarithmically on the liquidity of the stock, measured by the mean traded value per 
minute. All multiscaling exponents $\tau(q)$ display a similar liquidity dependence, which clearly 
indicates the lack of a universal form assumed by other studies. This behavior originates from 
long memory in both the frequency and the size of consecutive transactions. 



Financial markets are self-adaptive complex systems and their understanding requires 
interdisciplinary research, including the application of concepts and tools of statistical physics. 
The success of modern statistical physics lies to a large extent in explaining phenomena from 
phase transitions to far-from-equilibrium processes, where the two key concepts have been 
scaling and universality. When applied to physical systems, both can rely on a solid 
foundation: renormalization group theory. But how reliable can insights based on these principles 
be, if we move on to social or economic systems? According to economists, "physicists [simply] 
suffer from the belief that there must be universal rules" [1]. 

The aim of the present paper is to point out that the assumption of universality can lead 
to false conclusions regarding stock market dynamics [2]. We use multifractal analysis, an 
approach very commonly pursued in econophysics, to show that the size of the company, 
or more appropriately the liquidity of its stock, affects the observed characteristics of how it 
is traded on the market. This dependence is continuous, and therefore it implies the absence 
of universality classes in trading dynamics. 

By means of multifractal analysis, we show that: (i) Trading activity records show a 
crossover from weaker to stronger correlations around the time scale of 1 trading day. (ii) 
The strength of correlations above the crossover depends logarithmically on the average 
trading activity of the stock. (iii) The whole family of $\tau(q)$ multiscaling exponents of trading 
activity shows a similar variation. (iv) These effects originate from an interplay between the 
autocorrelations of the frequency and the size of consecutive transactions. 

The dataset used in our study was taken from the NYSE TAQ database [3], and it contains 
the records of all transactions of the New York Stock Exchange in the years 2000-2002. 

Let us denote the total traded value of a stock $i$ in the time window $[t, t + \Delta t]$ by 
$f_i^{\Delta t}(t)$. This is calculated as a sum of the values of all transactions for the given stock 
during $[t, t + \Delta t]$. If $N_i^{\Delta t}(t)$ denotes the number of trades for the stock in the interval, 
and the value of the $n$-th trade is $V_i(n)$, then one can write this formally as 

$$f_i^{\Delta t}(t) = \sum_{n,\; t_i(n) \in [t, t + \Delta t]} V_i(n), \qquad (1)$$

where $t_i(n)$ is the time of the $n$-th trade and the sum runs over the $N_i^{\Delta t}(t)$ trades in the interval. 
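The aggregation in eq. (1) can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used on the TAQ data: the function name `aggregate_trades`, the half-open window convention $[t, t + \Delta t)$ and the sample trades are all assumptions made for the example.

```python
import numpy as np

def aggregate_trades(times, values, dt, t_start, t_end):
    """Return (f, n): traded value and number of trades per window of length dt.

    Windows are taken as half-open intervals [t, t + dt); this convention is an
    assumption of the sketch, chosen so that no trade is counted twice.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    n_windows = int((t_end - t_start) // dt)
    # Index of the window that each trade falls into.
    idx = np.floor((times - t_start) / dt).astype(int)
    keep = (idx >= 0) & (idx < n_windows)
    # Sum of trade values per window, i.e. f in eq. (1).
    f = np.bincount(idx[keep], weights=values[keep], minlength=n_windows)
    # Number of trades per window.
    n = np.bincount(idx[keep], minlength=n_windows)
    return f, n

# Hypothetical trades: times in minutes, values in USD.
times = [0.5, 1.2, 1.7, 3.9, 4.1]
values = [100.0, 250.0, 50.0, 400.0, 75.0]
f, n = aggregate_trades(times, values, dt=2.0, t_start=0.0, t_end=6.0)
```

With these made-up inputs the three two-minute windows contain 3, 1 and 1 trades, respectively.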

Though the returns are known to be only short time correlated, financial data contain 
different kinds of long-range correlations; examples range from volatility to order flow [4-6]. 
Records of traded value are no exception [7-9], and are most often characterized by 
the Hurst exponent, or in general, by multifractal spectra [10,11]. Multifractal models 
represent a dynamically developing approach in describing financial processes both in conservative 
finance and econophysics (for a review, see [12]). 

Recent studies [7, 13] have shown that the standard deviation, and even higher moments 
of $f$ exist; thus it is possible to define the $q$-th order partition function in the following way: 

$$\sigma_i^q(\Delta t) = \left\langle \left| f_i^{\Delta t}(t) - \left\langle f_i^{\Delta t}(t) \right\rangle \right|^q \right\rangle \propto \Delta t^{\tau_i(q)}, \qquad (2)$$

where $\langle\cdot\rangle$ denotes time averaging. For any fixed stock $i$, the formula defines a $\tau_i(q)$ set of 
exponents, indexed by $q$, and determined by the slopes of eq. (2) on a log-log plot (1). These 
are often written in the form $\tau(q) = qH(q)$, and $H \equiv H(2)$ is called the Hurst exponent, while 
the other $H(q)$'s are the generalized Hurst exponents. This family of exponents is closely related 
to the correlation properties of the data. If $H = 0.5$, the data have no long range correlations, 
while for $H > 0.5$ ($H < 0.5$) signals have persistent (antipersistent) long range correlations. 
If $H(q) = H$ is independent of $q$, the signal is self-affine, while nontrivial $q$-dependence gives 
rise to multiscaling or multi-affinity. 
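To make the definition of the exponents concrete, the following Python sketch evaluates the partition function directly on non-overlapping windows and reads $\tau(q)$ off a log-log fit. It is a naive estimator given only for illustration: the function names, the non-overlapping windowing and the uncorrelated surrogate series are assumptions of the example, and no detrending is applied.

```python
import numpy as np

def partition_function(f_minute, dt, q):
    """q-th absolute central moment of the traded value summed over windows of dt samples."""
    f_minute = np.asarray(f_minute, dtype=float)
    n_windows = len(f_minute) // dt
    # Aggregate the one-minute series into non-overlapping windows of length dt.
    f_dt = f_minute[: n_windows * dt].reshape(n_windows, dt).sum(axis=1)
    return np.mean(np.abs(f_dt - f_dt.mean()) ** q)

def tau(f_minute, window_sizes, q):
    """Slope of log10(sigma^q) versus log10(dt), i.e. the exponent tau(q)."""
    sig = [partition_function(f_minute, dt, q) for dt in window_sizes]
    slope, _intercept = np.polyfit(np.log10(window_sizes), np.log10(sig), 1)
    return slope

# Surrogate data (assumption): independent exponential values, so H should be near 0.5.
rng = np.random.default_rng(0)
f_minute = rng.exponential(scale=1.0, size=200_000)
window_sizes = np.array([1, 2, 4, 8, 16, 32, 64])
H = tau(f_minute, window_sizes, q=2) / 2   # H = tau(2) / 2
```

For uncorrelated data the variance of a window sum grows linearly with the window size, so the fitted $H$ should come out close to 0.5; persistent data would give a larger slope.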

Here, we present an analysis of the $\sigma^q(\Delta t)$ partition functions. We investigated the 2416 
stocks which were continuously listed at NYSE during the years 2000-2002, and which had 
an average turnover $\langle f\rangle$ (mean traded value per minute) of at least 100 USD/min. This 
ensures that there are no extended periods where the stock is not traded at all, and thus $f(t)$ 
is well-defined. 

For the calculation of $\sigma^q(\Delta t)$ we used Detrended Fluctuation Analysis [11]. This method 
uses piecewise polynomial fits to remove nonstationarities from the data, and often produces 
good estimates for $\tau(q)$. We tested the robustness of our estimates to the order of this 
detrending by varying the order of the polynomials from 1 to 5, but the results did not 
change significantly. 
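The idea of piecewise polynomial detrending can be illustrated with a minimal Python sketch of the standard second-moment ($q = 2$) variant of Detrended Fluctuation Analysis. It is not the implementation used in this study: the function names, the choice of scales and the white-noise test signal are assumptions, and the multifractal generalization to other $q$ is omitted.

```python
import numpy as np

def dfa_fluctuation(x, s, order=1):
    """Root-mean-square detrended fluctuation F(s) for segments of length s."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())            # integrated signal
    n_seg = len(profile) // s
    segs = profile[: n_seg * s].reshape(n_seg, s).T   # shape (s, n_seg)
    t = np.arange(s)
    # Fit one polynomial of the given order to every segment (column) at once.
    coefs = np.polyfit(t, segs, order)           # shape (order + 1, n_seg)
    trend = np.vander(t, order + 1) @ coefs      # fitted trends, shape (s, n_seg)
    return np.sqrt(np.mean((segs - trend) ** 2))

def dfa_hurst(x, scales, order=1):
    """Slope of log10 F(s) versus log10 s, an estimate of the Hurst exponent."""
    F = [dfa_fluctuation(x, s, order) for s in scales]
    slope, _intercept = np.polyfit(np.log10(scales), np.log10(F), 1)
    return slope

# Test signal (assumption): Gaussian white noise, for which H should be near 0.5.
rng = np.random.default_rng(0)
x = rng.normal(size=2 ** 16)
scales = np.array([16, 32, 64, 128, 256, 512])
H_dfa = dfa_hurst(x, scales, order=1)
```

Repeating the last line with a different `order` is the kind of robustness check described above; for well-behaved data the estimate should change little.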

We then divided the stocks into five groups with respect to $\langle f\rangle$: those with $10^2$ USD/min 
$< \langle f\rangle < 10^3$ USD/min, those with $10^3$ USD/min $< \langle f\rangle < 10^4$ USD/min, ..., and finally $10^6$ 
USD/min $< \langle f\rangle$. We averaged the $\sigma^q(\Delta t)$ partition functions within each group (2). As 
an example, the results for $q = 2$ are shown in fig. 1(a). 
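The grouping step can be sketched as follows in Python. The sketch assumes decade-wide bins starting at $10^2$ USD/min with an open-ended top bin at $10^6$ USD/min; the function name `liquidity_group` and the sample turnovers are hypothetical.

```python
import numpy as np

def liquidity_group(mean_f):
    """Decade index of the mean turnover in USD/min: 2 for [1e2, 1e3), ..., 6 for >= 1e6.

    Assumed bin edges; values below 1e2 are excluded from the sample in the text,
    so clipping at 2 only guards against stray inputs.
    """
    mean_f = np.asarray(mean_f, dtype=float)
    decade = np.floor(np.log10(mean_f)).astype(int)
    return np.clip(decade, 2, 6)

# Hypothetical mean turnovers of five stocks, in USD/min.
mean_f = np.array([150.0, 2.0e3, 9.9e4, 3.0e5, 4.0e7])
groups = liquidity_group(mean_f)
```

Group-averaged partition functions would then be obtained by averaging over all stocks that share the same index.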

(1) Note that throughout the paper we use base-10 logarithms. 

(2) This averaging procedure decreases the noise present in the data, without affecting our main conclusions. 
Figure 1 - (a) The normalized partition function $\frac{1}{2}\log\sigma^2(\Delta t) - \frac{1}{2}\log\Delta t$ for the five groups of 
companies. A horizontal line would mean the absence of autocorrelations in the data. Instead, one 
observes a crossover phenomenon in the regime $\Delta t$ = 60 to 390 min. Below the crossover all groups 
show weakly correlated behavior. Above the crossover (fluctuations on daily and longer scales), the 
strength of correlations, and thus the slope corresponding to $H^+ - 0.5$, increases with the liquidity 
of the stock. Note: Fits are for the regimes below 60 min and above 1700 min. (b) The values of 
the scaling exponents $\tau^+(q)$, valid for time scales longer than a trading day, for the five groups of 
companies. Companies with higher average traded value exhibit stronger correlations and weaker 
multiscaling than their smaller counterparts. Correspondingly, their $\tau^+(q)$ is greater, and the shape 
of the curve is closer to the linear relationship $\tau^+(q) = q$. 

One finds that, regardless of group, the $\log\sigma^q(\Delta t)$ versus $\log\Delta t$ plots are not straight 
lines. Instead, one observes a crossover phenomenon (3) [7,8]: there are two regimes of $\Delta t$ 
for which different $\tau(q)$'s can be identified. For $\Delta t < 60$ min we use the notation 
$\tau^-(q)$, while for $\Delta t > 390$ min we use $\tau^+(q)$. One can define the related generalized Hurst exponents 
as $\tau^\pm(q) = qH^\pm(q)$. Systematically, $H^+(q) > H^-(q)$, which means that correlations become 
stronger when window sizes are greater than 390 min. 

Moreover, there is a remarkable difference between groups when $\Delta t > 390$ min. This means 
that the correlations present in the day-to-day variations of trading activity systematically 
depend on $\langle f\rangle$, as seen from the $H^+(q)$ values indicated in fig. 1(a). More of this dependence can 
be understood if one examines the scaling exponents for more powers of $q$. This was done by 
first evaluating the value of $\tau^+(q)$ for the individual stocks, and then averaging over the 
elements within each group. The results are shown in fig. 1(b). The plot implies that more 
liquid stocks (greater $\langle f\rangle$) display stronger correlations than their less liquid (smaller $\langle f\rangle$) 
counterparts, for any order $q > 0$. This is realized in such a way that the degree of multiscaling 
decreases, and the scaling exponents tend to the fully correlated self-affine behavior with the 
limiting exponents $\tau^+(q) = q$, $H^+(q) = 1$. 

(2, continued) Also note that the data were first corrected for the well-known U-shaped pattern of daily trading activity (see, e.g., 
ref. [14]), calculated independently for each group. 
(3) The fact that the properties of stock market time series are different on time scales shorter than and longer 
than 1 trading day was pointed out by many sources. The most common examples are the distribution of 
returns and the autocorrelations of volatility [4,5]. 
Figure 2 - (a) The values of the generalized Hurst exponents $H^+(q)$, valid for time scales longer 
than a trading day, for the five groups of companies. The difference in the strength of correlations, 
and thus in $H^+(q)$, is present for all powers $q$. This implies that such a dependence on liquidity is 
present in both low and high trading activity periods. (b) The values of $H^+(q)$ for, from top to bottom, 
$q = -1, 1, 2, 3, 4, 5$. Each point represents the average value for one of the five groups of companies. 
One can see that $H^+(q)$ changes in an approximately logarithmic fashion with $\langle f\rangle$. Note: Stocks 
grouped by $\langle f\rangle$, increasing from bottom to top, ranges given in USD/min. 

Fig. 2(a) shows the corresponding values of $H^+(q)$. The difference in the $H^+(q)$'s between 
the groups is present throughout the whole range of $q$'s, not only for large $q$'s which are 
sensitive to high trading activity. This indicates that the higher level of correlations in 
more liquid stocks cannot be exclusively attributed to periods of high trading activity. Instead, 
it is a general phenomenon that is present continuously (4). 

Despite the non-universality and the dependence of $\tau(q)$ on the liquidity of the 
stock, this dependence follows a clear systematic pattern. In fig. 2(b), we plot vertical 
"cuts" of fig. 2(a). These show that, for a fixed value of $q$, $\tau^+(q)$ increases with $\langle f\rangle$ in an 
approximately logarithmic way: 

$$\tau^+(q; \langle f\rangle) = C(q) + \gamma(q)\log\langle f\rangle, \qquad (3)$$

where $\gamma(q) \approx 0.04$ to $0.06$. 
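A logarithmic dependence of this form can be tested with an ordinary least-squares fit against $\log\langle f\rangle$. The Python sketch below uses synthetic numbers only: the group turnovers, the constants `C_true` and `gamma_true` and the noise level are hypothetical placeholders, not values measured in this study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical group-average turnovers in USD/min, one per liquidity group.
mean_f = np.array([3e2, 3e3, 3e4, 3e5, 3e6])

# Synthetic exponents following eq. (3) with small noise (assumed values).
C_true, gamma_true = 0.6, 0.05
tau_plus = C_true + gamma_true * np.log10(mean_f) + rng.normal(0.0, 0.005, size=5)

# Straight-line fit in log10(<f>): the slope estimates gamma(q), the intercept C(q).
gamma_fit, C_fit = np.polyfit(np.log10(mean_f), tau_plus, 1)
```

With real data, `tau_plus` would hold the group averages of the measured exponent for one fixed $q$, and the fit would be repeated for each $q$ of interest.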

Our results imply that the trading of assets of companies with very different size and 
liquidity cannot be described in a universal manner (5). There have been studies pointing out 
such asset-to-asset variations and the key role of liquidity [16-19]; however, they have been 
consistently overlooked by some econophysics groups. There is a wide range of studies that 
calculate ensemble averages over a large number of stocks, irrespective of their liquidities. In 
some cases universality seems indeed to hold, like for the normalized distribution of returns 

(4) One may notice that there is a strong deviation in the case of stocks with low liquidity and $q < -1$. The 
origin of this artifact is a finite size effect: the stocks are traded in lots of 100, and thus they cannot be traded 
in values less than price × 100. This minimum acts as a cutoff in small fluctuations, to which $q < -1$ moments 
are very sensitive. 

(5) A recent preprint [9] shows similar effects with respect to the market where the stocks are traded. More 
indications of similar behavior can be found in refs. [7, 15]. 
Figure 3 - (a) The normalized partition function $\frac{1}{2}\log\sigma_N^2(\Delta t) - \frac{1}{2}\log\Delta t$ for the five groups of 
companies. A horizontal line would mean the absence of autocorrelations in the data. The crossover 
regime is for slightly longer times, $\Delta t \approx$ 160 to 1200 min. Above the crossover the strength of 
correlations in $N$, and thus the slopes corresponding to $H_N^+ - 0.5$, increase with the liquidity of the 
stock. (b) Same as (a), but for $\frac{1}{2}\log\sigma_V^2(N/\langle N\rangle) - \frac{1}{2}\log N$. 
The darker shaded area corresponds to the crossover regime of $f$ at $N/\langle N\rangle \approx$ 60 to 390 min. Small 
stocks are traded infrequently, therefore they have no data points below the crossover. Note: Stocks 
grouped by $\langle f\rangle$, increasing from bottom to top, ranges given in USD/min. In both plots, fits are for 
the regimes below 60 min and above 1700 min. 



[20,21]. However, in other cases, as we have just seen, it is misleading to calculate averages 
for stocks with a wide range of liquidity as done in, e.g., refs. [22-25]. A "typical" $\tau(q)$ or 
multifractal spectrum of assets is not meaningful in the presence of this clear, systematic 
dependence. 

What aspect of the trading dynamics is the origin of this non-universality? As eq. (1) 
suggests, the source of fluctuations in $f$ is the fluctuation of $N$ and $V$ (see also ref. [26]). Thus, 
it is very instructive to define the Hurst exponents of these two processes in analogy with eq. 
(2). We restrict ourselves to the $q = 2$ moment. One can introduce the $H_{N_i}$ Hurst exponent 
of the time series $N_i^{\Delta t}(t)$ as 

$$\sigma_{N_i}^2(\Delta t) = \left\langle \left( N_i^{\Delta t}(t) - \left\langle N_i^{\Delta t}(t) \right\rangle \right)^2 \right\rangle \propto \Delta t^{2H_{N_i}}. \qquad (4)$$

This $H_{N_i}$ describes the temporal correlations of the number of trades. 

The results for the group averages of $\sigma_N^2$, and the asymptotically valid exponents $H_N^\pm$ 
are shown in fig. 3(a). A comparison with fig. 1(a) shows that both quantities behave 
similarly: fluctuations in the number of trades $N$ display a crossover and a liquidity dependence 
in the strength of correlations, just like $f$. 

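The $H_{V_i}$ Hurst exponent of the so-called tick-by-tick data $V_i(n)$ can be defined as 

$$\sigma_{V_i}^2(N/\langle N_i\rangle) = \left\langle \left( \sum_{n=1}^{N} V_i(n) - \left\langle \sum_{n=1}^{N} V_i(n) \right\rangle \right)^2 \right\rangle \propto \left( N/\langle N_i\rangle \right)^{2H_{V_i}}. \qquad (5)$$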
<f> (USD/min)    H_V^-   H_f^-   H_N^-   H_V^+   H_f^+   H_N^+
> 10^6           0.57    0.60    0.68    0.88    0.89    0.87
10^5 - 10^6      0.56    0.57    0.65    0.84    0.83    0.87
10^4 - 10^5      -       0.56    0.63    0.81    0.80    0.85
10^3 - 10^4      -       0.55    0.60    0.75    0.77    0.82
10^2 - 10^3      -       0.54    0.56    0.65    0.70    0.72

Table I - Hurst exponents for $f$, $N$ and $V$ for the 5 groups of stocks. For all groups, every exponent is 
higher above the crossover than below it. Moreover, above the crossover there are no large differences 
between $H_f^+$, $H_N^+$ and $H_V^+$. From the fits, the errors are estimated to be ±0.03. Note: $H_V^-$ is not defined 
for the 3 groups whose stocks are not traded at least every 10 minutes. 



The important point here is that the scaling variable is the number $N$ of consecutive trades, 
divided by the mean number of trades per minute, $\langle N_i\rangle$. This is crucial, because 
the trading frequency of the stocks varies over many orders of magnitude. Thus $N$ trades 
correspond to a different time span depending on trading frequency, i.e., on the stock. The 
scaling variable $N/\langle N_i\rangle$ has a dimension of minutes (just like $\Delta t$), and its fixed value always 
means the same time window size, regardless of the stock. 
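The rescaling of trade counts to minutes can be illustrated by a short Python sketch of the tick-by-tick variance of eq. (5). The function name `tick_partition`, the non-overlapping blocks of trades and the surrogate trade sizes are assumptions of this example rather than details taken from the analysis.

```python
import numpy as np

def tick_partition(values, n_trades, trades_per_min):
    """Variance of the summed value of n_trades consecutive trades.

    Returns (window size in minutes, variance). The window size is the scaling
    variable N / <N_i>, so results for different stocks share a common time axis.
    """
    values = np.asarray(values, dtype=float)
    n_win = len(values) // n_trades
    # Sum the values over non-overlapping blocks of n_trades consecutive trades.
    sums = values[: n_win * n_trades].reshape(n_win, n_trades).sum(axis=1)
    return n_trades / trades_per_min, np.var(sums)

# Surrogate trade sizes (assumption): independent, unit variance.
rng = np.random.default_rng(0)
values = rng.exponential(scale=1.0, size=100_000)

# A hypothetical stock trading twice per minute: 10 trades span 5 minutes.
minutes, var_sum = tick_partition(values, n_trades=10, trades_per_min=2.0)
```

For these uncorrelated surrogate values the variance of a block of 10 trades should be close to 10 times the single-trade variance, which corresponds to $H_V = 0.5$.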

Moreover, when applying eq. (5), there is a natural lower limit in window size: one cannot 
take less than one trade, and so $N \geq 1$. Consequently, a group average for $\sigma_V^2$ is undefined 
where the scaling variable would be $N/\langle N_i\rangle < 1/\langle N_i\rangle$ for any stock in the group (6). For 
more liquid stocks, $\langle N_i\rangle$ is larger, thus the minimal window size is smaller. 

The results are shown in fig. 3(b). $H_V^-$ is only defined for the two groups whose stocks 
are traded at least every 10 minutes, and its values indicate weak or no liquidity dependence. $H_V^+$ 
exists for all groups and follows the same trend of increasing correlations for greater liquidity. 

The number of transactions in a given time window $[t, t + \Delta t]$ is, to a good approximation, 
independent of the value of the single transactions (7). Under this condition, one can 
show that for any stock $i$: 

$$\sigma_i^2(\Delta t) \approx \sigma_{N_i}^2(\Delta t)\,\langle V_i\rangle^2 + \sigma_{V_i}^2\,\langle N_i^{\Delta t}\rangle^{2H_{V_i}}, \qquad (6)$$

where $\langle V_i\rangle$ is the mean, and $\sigma_{V_i}$ is the standard deviation of the value of individual 
transactions. The origins of the two terms in the formula are the following [26]: 

1. The first term describes the effect of fluctuations in the number of transactions. Let 
us assume that the size of the transactions is constant, so $V_i(n) = \langle V_i\rangle$ and $\sigma_{V_i} = 0$. 
Then the second term is zero, and eq. (6) simplifies to $\sigma_{N_i}^2(\Delta t)\,\langle V_i\rangle^2$. 

2. The second term describes the effect of fluctuations in the value of individual 
transactions. If one assumes that the number of transactions is the same in every time window, 
$N_i^{\Delta t}(t) = \langle N_i^{\Delta t}\rangle$, then $\sigma_{N_i}(\Delta t) = 0$. The first term becomes zero, and eq. (6) reduces to its second term. 

Thus the correlations in $f$ originate from the correlations in $N$ and $V$. By definition, 
the l.h.s. of eq. (6) is proportional to $\Delta t^{2H_f}$. The first term on the r.h.s. is proportional 
to $\Delta t^{2H_N}$, while the second term can be estimated to scale as $\Delta t^{2H_V}$. For large $\Delta t$, the 
behavior of $\sigma_i^2$ is dominated by the larger of $H_N$ and $H_V$. 
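The split of the variance into a trade-number term and a trade-size term can be checked numerically in the simplest special case. The Python sketch below assumes Poisson-distributed trade counts and independent, uncorrelated trade sizes (so $H_V = 0.5$), for which the textbook compound-sum identity Var($f$) = Var($N$)$\langle V\rangle^2$ + $\langle N\rangle$Var($V$) holds; all distributions and parameters are illustrative choices, not properties of the data.

```python
import numpy as np

rng = np.random.default_rng(2)
n_windows = 50_000

# Number of trades per window (assumption: Poisson with mean 20).
counts = rng.poisson(lam=20.0, size=n_windows)

# Independent trade values (assumption: exponential with mean 3).
values = rng.exponential(scale=3.0, size=counts.sum())

# Traded value per window: sum of the values of the trades in that window.
edges = np.concatenate(([0], np.cumsum(counts)))
csum = np.concatenate(([0.0], np.cumsum(values)))
f = csum[edges[1:]] - csum[edges[:-1]]

# Left- and right-hand side of the variance decomposition.
lhs = f.var()
rhs = counts.var() * values.mean() ** 2 + values.var() * counts.mean()
```

Long-range correlated trade sizes would change the scaling of the second term with the window size, which is the case of interest in the text; this sketch covers only the uncorrelated limit.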

(6) We allowed up to 10% of such missing data. 

(7) This means that $N_i^{\Delta t}(t)$ is independent of $f_i^{\Delta t}(t)/N_i^{\Delta t}(t)$. The $R^2$ values of regressions between the 
logarithms of these two quantities are typically of the order 0.03 in the data. 
This is in agreement with the results summarized in table I. The table also shows that 
above the crossover there are no major differences between $H_f^+$, $H_N^+$ and $H_V^+$. This means that 
neither of the two processes dominates in general. 

We have studied the correlation functions (2), (4) and (5), and found the following: (i) There 
exists a crossover in the behavior as a function of the time window at $\Delta t \approx 1$ trading day; 
(ii) There is non-universal (multi)scaling for large $\Delta t$ with a systematic dependence of the 
exponents on the liquidity; (iii) Eq. (6) points out the interplay between the fluctuations in the 
number of trades and the tick-by-tick values, resulting in the observed long term correlations 
for the activity. While we emphasize the non-universal character of the exponents, we also 
mean to underline the systematic trends as a function of the company size (liquidity). These 
properties of trading should be addressed by future modeling efforts of the stock market. 

The authors are grateful to Gyorgy Andor for his help with financial data. JK is a member 
of the Center for Applied Mathematics and Computational Physics, BME. Support by OTKA 
T049238 is acknowledged. 